Results 1 - 13 of 13
1.
BioData Min ; 11: 23, 2018.
Article in English | MEDLINE | ID: mdl-30410580

ABSTRACT

BACKGROUND: ReliefF is a nearest-neighbor-based feature selection algorithm that efficiently detects variants that are important due to statistical interactions or epistasis. For categorical predictors, like genotypes, the standard metric used in ReliefF has been a simple (binary) mismatch difference. In this study, we develop new metrics of varying complexity that incorporate allele sharing, adjustment for allele frequency heterogeneity via the genetic relationship matrix (GRM), and physicochemical differences of variants via a new transition/transversion encoding. METHODS: We introduce a new two-dimensional transition/transversion genotype encoding for ReliefF, and we implement three ReliefF attribute metrics: (1) genotype mismatch (GM), which is the ReliefF standard, (2) allele mismatch (AM), which accounts for heterozygous differences and has not been used previously in ReliefF, and (3) the new transition/transversion metric. We incorporate these attribute metrics into the ReliefF nearest neighbor calculation with a Manhattan metric, and we introduce GRM as a new ReliefF nearest-neighbor metric to adjust for allele frequency heterogeneity. RESULTS: We apply ReliefF with each metric to a GWAS of major depressive disorder and compare the detection of genes in pathways implicated in depression, including Axon Guidance, Neuronal System, and G Protein-Coupled Receptor Signaling. We also compare with detection by Random Forest and Lasso as well as random/null selection to assess pathway size bias. CONCLUSIONS: Our results suggest that using more genetically motivated encodings, such as transition/transversion, and metrics that adjust for allele frequency heterogeneity, such as GRM, lead to ReliefF attribute scores with improved pathway enrichment.
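
The genotype mismatch (GM) and allele mismatch (AM) differences described in this abstract can be illustrated with a minimal Python sketch. It assumes genotypes are coded as minor-allele counts (0, 1, 2), which the abstract does not state, and all function names are hypothetical. The `is_transition` helper only applies the textbook transition/transversion definition; it does not reproduce the paper's two-dimensional encoding, and the GRM adjustment is not shown.

```python
# Assumption: genotypes are coded as minor-allele counts 0, 1, or 2.
# All names below are illustrative, not taken from the paper's software.

def genotype_mismatch(g1, g2):
    """GM: 0 if the two genotypes are identical, 1 otherwise."""
    return 0.0 if g1 == g2 else 1.0


def allele_mismatch(g1, g2):
    """AM: fraction of the two alleles that differ (0, 0.5, or 1)."""
    return abs(g1 - g2) / 2.0


# Transitions are A<->G (purines) and C<->T (pyrimidines);
# every other substitution is a transversion.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def is_transition(allele_a, allele_b):
    """True if the substitution between the two bases is a transition."""
    return frozenset((allele_a.upper(), allele_b.upper())) in TRANSITIONS


def manhattan_distance(sample_a, sample_b, diff=genotype_mismatch):
    """Sum of per-attribute differences between two samples."""
    return sum(diff(a, b) for a, b in zip(sample_a, sample_b))
```

With this coding, a homozygote and a heterozygote differ by 1 under GM but only by 0.5 under AM, which is the sense in which AM accounts for heterozygous differences.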

2.
Genes Immun ; 17(4): 244-50, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27052692

ABSTRACT

Expression quantitative trait loci (eQTL) studies have functionalized nucleic acid variants through the regulation of gene expression. Although most eQTL studies only examine the effects of single variants on transcription, a more complex process of variant-variant interaction (epistasis) may regulate transcription. Herein, we describe a tool called interaction QTL (iQTL) designed to efficiently detect epistatic interactions that regulate gene expression. To maximize biological relevance and minimize the computational and hypothesis testing burden, iQTL restricts interactions such that one variant is within a user-defined proximity of the transcript (cis-regulatory). We apply iQTL to a data set of 183 smallpox vaccine study participants with genome-wide association study and gene expression data from unstimulated samples and samples stimulated by inactivated vaccinia virus. While computing only 0.15% of possible interactions, we identify 11 probe sets whose expression is regulated through a variant-variant interaction. We highlight the functional epistatic interactions among apoptosis-related genes, DIABLO, TRAPPC4 and FADD, in the context of smallpox vaccination. We also use an integrative network approach to characterize these iQTL interactions in a posterior network of known prior functional interactions. iQTL is an efficient, open-source tool to analyze variant interactions in eQTL studies, providing better understanding of the function of epistasis in immune response and other complex phenotypes.
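
The cis restriction described in this abstract, in which at least one variant of each tested pair must lie near the transcript, can be sketched as follows. This is an illustration only: the function name, the tuple layout, and the default window size are assumptions and do not come from the iQTL software.

```python
def cis_restricted_pairs(variants, transcript, window=500_000):
    """Enumerate variant pairs in which at least one variant is cis.

    variants:   list of (variant_id, chromosome, position) tuples
    transcript: (chromosome, start, end) tuple
    window:     hypothetical user-defined proximity in base pairs
    """
    chrom, start, end = transcript
    # A variant is cis if it lies on the transcript's chromosome
    # within `window` base pairs of the transcript boundaries.
    cis_ids = [vid for vid, vchrom, pos in variants
               if vchrom == chrom and start - window <= pos <= end + window]
    pairs = set()
    for cis_id in cis_ids:
        for other_id, _, _ in variants:
            if other_id != cis_id:
                # Store pairs unordered so a pair of two cis variants
                # is only counted once.
                pairs.add(tuple(sorted((cis_id, other_id))))
    return sorted(pairs)
```

Only pairs anchored by a cis variant are returned, so the number of interaction tests grows with the number of cis variants rather than with the square of the total variant count.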


Subjects
Apoptosis/genetics; Epistasis, Genetic; Quantitative Trait Loci; Smallpox/genetics; Software; Adolescent; Adult; Apoptosis Regulatory Proteins; Fas-Associated Death Domain Protein/genetics; Fas-Associated Death Domain Protein/metabolism; Female; Gene Regulatory Networks; Humans; Intracellular Signaling Peptides and Proteins/genetics; Intracellular Signaling Peptides and Proteins/metabolism; Male; Mitochondrial Proteins/genetics; Mitochondrial Proteins/metabolism; Smallpox/immunology; Smallpox Vaccine/immunology; Vesicular Transport Proteins/genetics; Vesicular Transport Proteins/metabolism
3.
Genet Epidemiol ; 37(6): 614-21, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23740754

ABSTRACT

Open source tools are needed to facilitate the construction, analysis, and visualization of gene-gene interaction networks for sequencing data. To address this need, we present Encore, an open source network analysis pipeline for genome-wide association studies and rare variant data. Encore constructs Genetic Association Interaction Networks or epistasis networks using two optional approaches: our previous information-theory method or a generalized linear model approach. Additionally, Encore includes multiple data filtering options, including Random Forest/Random Jungle for main effect enrichment and Evaporative Cooling and Relief-F filters for enrichment of interaction effects. Encore implements SNPrank network centrality for identifying susceptibility hubs (nodes containing a large amount of disease susceptibility information through the combination of multivariate main effects and multiple gene-gene interactions in the network), and it provides appropriate files for interactive visualization of a network using tools from our online Galaxy instance. We implemented these algorithms in C++ using OpenMP for shared-memory parallel analysis on a server or desktop. To demonstrate Encore's utility in analysis of genetic sequencing data, we present an analysis of exome resequencing data from healthy individuals and those with Systemic Lupus Erythematosus (SLE). Our results verify the importance of the previously associated SLE genes HLA-DRB and NCF2, and these two genes had the highest gene-gene interaction degrees among the susceptibility hubs. An additional 14 genes previously associated with SLE emerged in our epistasis network model of the exome data, and three novel candidate genes, ST8SIA4, CMTM4, and C2CD4B, were implicated in the model. In summary, we present a comprehensive tool for epistasis network analysis and the first such analysis of exome data from a genetic study of SLE.


Subjects
Epistasis, Genetic; Gene Regulatory Networks; Lupus Erythematosus, Systemic/genetics; Software; Algorithms; Exome; Genetic Predisposition to Disease; Genome-Wide Association Study; HLA-DR beta-Chains/genetics; Humans; Linkage Disequilibrium; NADPH Oxidases/genetics; Sialyltransferases/genetics
4.
Transl Psychiatry ; 2: e154, 2012 Aug 14.
Article in English | MEDLINE | ID: mdl-22892719

ABSTRACT

Most pathway and gene-set enrichment methods prioritize genes by their main effect and do not account for variation due to interactions in the pathway. A portion of the presumed missing heritability in genome-wide association studies (GWAS) may be accounted for through gene-gene interactions and additive genetic variability. In this study, we prioritize genes for pathway enrichment in GWAS of bipolar disorder (BD) by aggregating gene-gene interaction information with main effect associations through a machine learning (evaporative cooling) feature selection and epistasis network centrality analysis. We validate this approach in a two-stage (discovery/replication) pathway analysis of GWAS of BD. The discovery cohort comes from the Wellcome Trust Case Control Consortium (WTCCC) GWAS of BD, and the replication cohort comes from the National Institute of Mental Health (NIMH) GWAS of BD in European Ancestry individuals. Epistasis network centrality yields replicated enrichment of Cadherin signaling pathway, whose genes have been hypothesized to have an important role in BD pathophysiology but have not demonstrated enrichment in previous analysis. Other enriched pathways include Wnt signaling, circadian rhythm pathway, axon guidance and neuroactive ligand-receptor interaction. In addition to pathway enrichment, the collective network approach elevates the importance of ANK3, DGKH and ODZ4 for BD susceptibility in the WTCCC GWAS, despite their weak single-locus effect in the data. These results provide evidence that numerous small interactions among common alleles may contribute to the diathesis for BD and demonstrate the importance of including information from the network of gene-gene interactions as well as main effects when prioritizing genes for pathway analysis.


Subjects
Bipolar Disorder/genetics; Cadherins/genetics; Epistasis, Genetic; Gene Regulatory Networks; Genome-Wide Association Study/methods; Algorithms; Artificial Intelligence; Cohort Studies; Genetic Predisposition to Disease; Genetic Variation; Humans; Linear Models; Polymorphism, Single Nucleotide
5.
Genes Immun ; 13(6): 469-73, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22622198

ABSTRACT

Vast diversity in the antibody repertoire is a key component of the adaptive immune response. This diversity is generated centrally through the assembly of variable, diversity and joining gene segments, and peripherally by somatic hypermutation and class-switch recombination. The peripheral diversification process is thought to only occur in response to antigenic stimulus, producing antigen-selected memory B cells. Surprisingly, analyses of the variable, diversity and joining gene segments have revealed that the naïve and memory subsets are composed of similar proportions of these elements. Lacking, however, is a more detailed study, analyzing the repertoires of naïve and memory subsets at the level of the complete V(D)J recombinant. This report presents a thorough examination of V(D)J recombinants in the human peripheral blood repertoire, revealing surprisingly large repertoire differences between circulating B-cell subsets and providing genetic evidence for global control of repertoire diversity in naïve and memory circulating B-cell subsets.


Subjects
Antibody Diversity/genetics; Immunologic Memory/genetics; Adult; B-Lymphocyte Subsets/immunology; High-Throughput Nucleotide Sequencing; Humans; Immunoglobulin G/genetics; Immunoglobulin M/genetics; V(D)J Recombination
6.
Genes Immun ; 12(6): 457-65, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21368772

ABSTRACT

Host genetic variation, particularly within the human leukocyte antigen (HLA) loci, reportedly mediates heterogeneity in immune response to certain vaccines; however, no large study of genetic determinants of anthrax vaccine response has been described. We searched for associations between the immunoglobulin G antibody to protective antigen (AbPA) response to Anthrax Vaccine Adsorbed (AVA) in humans, and polymorphisms at HLA class I (HLA-A, -B, and -C) and class II (HLA-DRB1, -DQA1, -DQB1, -DPB1) loci. The study included 794 European-Americans and 200 African-Americans participating in a 43-month, double-blind and placebo-controlled clinical trial of AVA (clinicaltrials.gov identifier NCT00119067). Among European-Americans, genes from tightly linked HLA-DRB1, -DQA1, -DQB1 haplotypes displayed significant overall associations with longitudinal variation in AbPA levels at 4, 8, 26 and 30 weeks from baseline in response to vaccination with three or four doses of AVA (global P=6.53 × 10^-4). In particular, carriage of the DRB1-DQA1-DQB1 haplotypes *1501-*0102-*0602 (P=1.17 × 10^-5), *0101-*0101-*0501 (P=0.009) and *0102-*0101-*0501 (P=0.006) was associated with significantly lower AbPA levels. In carriers of two copies of these haplotypes, lower AbPA levels persisted following subsequent vaccinations. No significant associations were observed amongst African-Americans or for any HLA class I allele/haplotype. Further studies will be required to replicate these findings and to explore the role of host genetic variation outside of the HLA region.


Subjects
Anthrax Vaccines/immunology; Antibody Formation/genetics; HLA-DQ Antigens/genetics; HLA-DR Antigens/genetics; Adult; Aged; Alleles; Anthrax/immunology; Female; Gene Frequency; Genetic Variation; Genotype; Haplotypes; Histocompatibility Antigens Class I/genetics; Humans; Immunoglobulin G/biosynthesis; Immunoglobulin G/genetics; Male; Middle Aged; Polymorphism, Single Nucleotide
7.
Bioinformatics ; 27(2): 284-5, 2011 Jan 15.
Article in English | MEDLINE | ID: mdl-21115438

ABSTRACT

MOTIVATION: Bioinformatics researchers have a variety of programming languages and architectures at their disposal, and recent advances in graphics processing unit (GPU) computing have added a promising new option. However, many performance comparisons inflate the actual advantages of GPU technology. In this study, we carry out a realistic performance evaluation of SNPrank, a network centrality algorithm that ranks single nucleotide polymorphisms (SNPs) based on their importance in the context of a phenotype-specific interaction network. Our goal is to identify the best computational engine for the SNPrank web application and to provide a variety of well-tested implementations of SNPrank for bioinformaticists to integrate into their research. RESULTS: Using SNP data from the Wellcome Trust Case Control Consortium genome-wide association study of Bipolar Disorder, we compare multiple SNPrank implementations, including Python, Matlab and Java as well as CPU versus GPU implementations. When compared with naïve, single-threaded CPU implementations, the GPU yields a large improvement in the execution time. However, with comparable effort, multi-threaded CPU implementations negate the apparent advantage of GPU implementations. AVAILABILITY: The SNPrank code is open source and available at http://insilico.utulsa.edu/snprank.


Subjects
Genome-Wide Association Study; Polymorphism, Single Nucleotide; Software; Algorithms; Computational Biology; Computer Graphics; Computers; Programming Languages
8.
Front Genet ; 2: 109, 2011.
Article in English | MEDLINE | ID: mdl-22303403

ABSTRACT

There is growing evidence that much more of the genome than previously thought is required to explain the heritability of complex phenotypes. Recent studies have demonstrated that numerous common variants from across the genome explain portions of genetic variability, spawning various avenues of research directed at explaining the remaining heritability. This polygenic structure is also the motivation for the growing application of pathway and gene set enrichment techniques, which have yielded promising results. These findings suggest that the coordination of genes in pathways that are known to occur at the gene regulatory level also can be detected at the population level. Although genes in these networks interact in complex ways, most population studies have focused on the additive contribution of common variants and the potential of rare variants to explain additional variation. In this brief review, we discuss the potential to explain additional genetic variation through the agglomeration of multiple gene-gene interactions as well as main effects of common variants in terms of a network paradigm. Just as is the case for single-locus contributions, we expect each gene-gene interaction edge in the network to have a small effect, but these effects may be reinforced through hubs and other connectivity structures in the network. We discuss some of the opportunities and challenges of network methods for analyzing genome-wide association studies (GWAS) such as the study of hubs and motifs, and integrating other types of variation and environmental interactions. Such network approaches may unveil hidden variation in GWAS, improve understanding of mechanisms of disease, and possibly fit into a network paradigm of evolutionary genetics.

9.
Genes Immun ; 11(8): 630-6, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20613780

ABSTRACT

The variation in antibody response to vaccination likely involves small contributions of numerous genetic variants, such as single-nucleotide polymorphisms (SNPs), which interact in gene networks and pathways. To accumulate the bits of genetic information relevant to the phenotype that are distributed throughout the interaction network, we develop a network eigenvector centrality algorithm (SNPrank) that is sensitive to the weak main effects, gene-gene interactions and small higher-order interactions through hub effects. Analogous to Google PageRank, we interpret the algorithm as the simulation of a random SNP surfer (RSS) that accumulates bits of information in the network through a dynamic probabilistic Markov chain. The transition matrix for the RSS is based on a data-driven genetic association interaction network (GAIN), the nodes of which are SNPs weighted by the main-effect strength and edges weighted by the gene-gene interaction strength. We apply SNPrank to a GAIN analysis of a candidate-gene association study on human immune response to smallpox vaccine. SNPrank implicates a SNP in the retinoid X receptor α (RXRA) gene through a network interaction effect on antibody response. This vitamin A- and D-signaling mediator has been previously implicated in human immune responses, although it would be neglected in a standard analysis because its significance is unremarkable outside the context of its network centrality. This work suggests SNPrank to be a powerful method for identifying network effects in genetic association data and reveals a potential vitamin regulation network association with antibody response.
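
The random-surfer idea in this abstract can be sketched as a PageRank-style power iteration over a GAIN matrix. The sketch below is not the published SNPrank formula. It assumes a symmetric matrix with main effects on the diagonal and interaction strengths off the diagonal, uses the main effects as the teleportation distribution, and takes a damping factor `gamma`; these specific choices are illustrative assumptions.

```python
def snprank_sketch(gain, gamma=0.85, tol=1e-10, max_iter=1000):
    """PageRank-style centrality on a GAIN matrix (illustrative only).

    gain[i][i] holds the main effect of SNP i; gain[i][j] (i != j)
    holds the interaction strength between SNPs i and j.
    """
    n = len(gain)
    main = [gain[i][i] for i in range(n)]
    total_main = sum(main)
    # Assumption: the surfer teleports in proportion to main effects.
    teleport = ([m / total_main for m in main] if total_main > 0
                else [1.0 / n] * n)
    # Total interaction weight leaving each SNP (column sums, no diagonal).
    out_weight = [sum(gain[i][j] for i in range(n) if i != j)
                  for j in range(n)]
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new_rank = [(1.0 - gamma) * t for t in teleport]
        for j in range(n):
            if out_weight[j] > 0:
                # Follow an interaction edge with probability
                # proportional to its strength.
                for i in range(n):
                    if i != j:
                        new_rank[i] += gamma * rank[j] * gain[i][j] / out_weight[j]
            else:
                # A SNP with no interactions hands its rank to teleportation.
                for i in range(n):
                    new_rank[i] += gamma * rank[j] * teleport[i]
        change = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if change < tol:
            break
    return rank
```

In a star-shaped network with equal main effects, the hub SNP receives the highest score even though its main effect is no larger than the others, which mirrors the hub effect the abstract describes.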


Subjects
Antibody Formation/genetics; Genome-Wide Association Study/methods; Smallpox Vaccine/immunology; Algorithms; Cytochrome P-450 CYP1A1/genetics; Gene Regulatory Networks; Genes; Humans; Markov Chains; NADPH Oxidases/genetics; Phenotype; Polymorphism, Single Nucleotide; Retinoid X Receptor alpha/genetics; Smallpox Vaccine/genetics
10.
Genes Immun ; 10(2): 112-9, 2009 Mar.
Article in English | MEDLINE | ID: mdl-18923431

ABSTRACT

Complex clinical outcomes, such as adverse reaction to vaccination, arise from the concerted interactions among the myriad components of a biological system. Therefore, comprehensive etiological models can be developed only through the integrated study of multiple types of experimental data. In this study, we apply this paradigm to high-dimensional genetic and proteomic data collected to elucidate the mechanisms underlying the development of adverse events (AEs) in patients after smallpox vaccination. As vaccination was successful in all of the patients under study, the AE outcomes reported likely represent the result of interactions among immune system components that result in excessive or prolonged immune stimulation. In this study, we examined 1442 genetic variables (single nucleotide polymorphisms) and 108 proteomic variables (serum cytokine concentrations) to model AE risk. To accomplish this daunting analytical task, we employed the Random Forests (RF) method to filter the most important attributes, and then we used the selected attributes to build a final decision tree model. This strategy is well suited to integrated analysis, as relevant attributes may be selected from categorical or continuous data. Importantly, RF is a natural approach for studying the type of gene-gene, gene-protein and protein-protein interactions we hypothesize to be involved in the development of clinical AEs. RF importance scores for particular attributes take interactions into account, and there may be interactions across data types. Combining information from previous studies on AEs related to smallpox vaccination with the genetic and proteomic attributes identified by RF, we built a comprehensive model of AE development that includes the cytokines intercellular adhesion molecule-1 (ICAM-1 or CD54), interleukin-10 (IL-10), and colony stimulating factor-3 (CSF-3 or G-CSF) and a genetic polymorphism in the cytokine gene interleukin-4 (IL4). The biological factors included in the model support our hypothesized mechanism for the development of AEs involving prolonged stimulation of inflammatory pathways and an imbalance of normal tissue damage repair pathways. This study shows the utility of RF for such analytical tasks, while both enhancing and reinforcing our working model of AE development after smallpox vaccination.


Subjects
Cytokines/blood; Cytokines/genetics; Intercellular Adhesion Molecule-1/blood; Intercellular Adhesion Molecule-1/genetics; Models, Biological; Polymorphism, Single Nucleotide; Smallpox Vaccine/adverse effects; Biomarkers/blood; Decision Making, Computer-Assisted; Female; Humans; Inflammation/blood; Inflammation/chemically induced; Inflammation/genetics; Male; Proteomics/methods; Smallpox Vaccine/administration & dosage; Vaccination
11.
Cancer Inform ; 6: 433-47, 2008.
Article in English | MEDLINE | ID: mdl-19259421

ABSTRACT

An artificial immune system algorithm is introduced in which nonlinear dynamic models are evolved to fit time series of interacting biomolecules. This grammar-based machine learning method learns the structure and parameters of the underlying dynamic model. In silico immunogenetic mechanisms for the generation of model-structure diversity are implemented with the aid of a grammar, which also enforces semantic constraints of the evolved models. The grammar acts as a DNA repair polymerase that can identify recombination and hypermutation signals in the antibody (model) genome. These signals contain information interpretable by the grammar to maintain model context. Grammatical Immune System Evolution (GISE) is applied to a nonlinear system identification problem in which a generalized (nonlinear) dynamic Bayesian model is evolved to fit biologically motivated artificial time-series data. From experimental data, we use GISE to infer an improved kinetic model for the oxidative metabolism of 17β-estradiol (E2), the parent hormone of the estrogen metabolism pathway.

12.
Bioinformatics ; 23(16): 2113-20, 2007 Aug 15.
Article in English | MEDLINE | ID: mdl-17586549

ABSTRACT

MOTIVATION: The development of genome-wide capabilities for genotyping has led to the practical problem of identifying the minimum subset of genetic variants relevant to the classification of a phenotype. This challenge is especially difficult in the presence of attribute interactions, noise and small sample size. METHODS: Analogous to the physical mechanism of evaporation, we introduce an evaporative cooling (EC) feature selection algorithm that seeks to obtain a subset of attributes with the optimum information temperature (i.e. the least noise). EC uses an attribute quality measure analogous to thermodynamic free energy that combines Relief-F and mutual information to evaporate (i.e. remove) noise features, leaving behind a subset of attributes that contain DNA sequence variations associated with a given phenotype. RESULTS: EC is able to identify functional sequence variations that involve interactions (epistasis) between other sequence variations that influence their association with the phenotype. This ability is demonstrated on simulated genotypic data with attribute interactions and on real genotypic data from individuals who experienced adverse events following smallpox vaccination. The EC formalism allows us to combine information entropy, energy and temperature into a single information free energy attribute quality measure that balances interaction and main effects. AVAILABILITY: Open source software, written in Java, is freely available upon request.
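
The evaporation idea in this abstract can be illustrated with a small sketch that computes mutual information between an attribute and the phenotype, then repeatedly removes the attribute with the lowest combined score. The abstract does not give the form of the free-energy measure, so the linear combination `relief + temperature * mi` used here is an assumption, as are the function names. A fuller implementation would presumably recompute Relief-F scores after each removal; this sketch uses fixed, precomputed scores.

```python
from collections import Counter
from math import log2


def mutual_information(attribute, phenotype):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(attribute)
    joint = Counter(zip(attribute, phenotype))
    count_x = Counter(attribute)
    count_y = Counter(phenotype)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return mi


def evaporate(relief_scores, mi_scores, temperature=1.0, keep=2):
    """Remove the lowest-scoring attribute until `keep` remain.

    Assumption: quality = relief + temperature * mi. This is a
    hypothetical stand-in for the paper's free-energy measure.
    """
    remaining = set(relief_scores)
    while len(remaining) > keep:
        # Sorting first makes tie-breaking deterministic.
        worst = min(sorted(remaining),
                    key=lambda a: relief_scores[a] + temperature * mi_scores[a])
        remaining.remove(worst)
    return sorted(remaining)
```

In this toy form, the temperature acts as a weight that trades off the interaction-sensitive score (Relief-F) against the main-effect score (mutual information).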


Subjects
Chromosome Mapping/methods; DNA Mutational Analysis/methods; Databases, Genetic; Evolution, Molecular; Genotype; Sequence Analysis, DNA/methods; Base Sequence; Computer Simulation; Models, Genetic; Models, Statistical; Molecular Sequence Data
13.
Phys Rev E Stat Nonlin Soft Matter Phys ; 73(2 Pt 1): 021912, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16605367

ABSTRACT

We introduce a grammar-based hybrid approach to reverse engineering nonlinear ordinary differential equation models from observed time series. This hybrid approach combines a genetic algorithm to search the space of model architectures with a Kalman filter to estimate the model parameters. Domain-specific knowledge is used in a context-free grammar to restrict the search space for the functional form of the target model. We find that the hybrid approach outperforms a pure evolutionary algorithm method, and we observe features in the evolution of the dynamical models that correspond with the emergence of favorable model components. We apply the hybrid method to both artificially generated time series and experimentally observed protein levels from subjects who received the smallpox vaccine. From the observed data, we infer a cytokine protein interaction network for an individual's response to the smallpox vaccine.
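
The parameter-estimation half of the hybrid approach can be illustrated with a toy scalar Kalman filter that estimates a single constant parameter from noisy measurements. This is a deliberately simplified linear case with no process noise; the filter variant, state definition, and noise model used in the paper for nonlinear ODE models are not given in the abstract, and the genetic-algorithm search over model structures is not shown. All names are hypothetical.

```python
def kalman_estimate_constant(measurements, meas_var,
                             init_est=0.0, init_var=1e3):
    """Scalar Kalman filter for a constant parameter (toy example).

    measurements: noisy observations of the parameter
    meas_var:     assumed measurement-noise variance
    init_est:     prior estimate of the parameter
    init_var:     prior variance (large = weak prior)
    """
    est, var = init_est, init_var
    history = []
    for z in measurements:
        gain = var / (var + meas_var)   # Kalman gain
        est = est + gain * (z - est)    # correct with the innovation
        var = (1.0 - gain) * var        # posterior variance shrinks
        history.append(est)
    return est, var, history
```

With a weak prior, the estimate approaches the running mean of the measurements, and the variance decreases with each observation.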


Subjects
Algorithms; Gene Expression Profiling/methods; Gene Expression Regulation/physiology; Models, Biological; Signal Transduction/physiology; Transcription Factors/metabolism; Animals; Computer Simulation; Humans; Nonlinear Dynamics; Pattern Recognition, Automated; Time Factors